The Dependent Random Measures with Independent Increments in Mixture Models
When observations are organized into groups that share commonalities,
dependent random measures are an ideal modeling choice. A key property of
dependent random measures is that the atoms of the posterior distribution are
shared amongst groups, so groups can borrow information from each other. When a
normalized dependent random measure prior with independent increments is
applied, we can derive the corresponding exchangeable partition probability
function (EPPF), and subsequently deduce an inference algorithm for any mixture
model likelihood. We provide all the necessary derivations and solutions for
this framework. For demonstration, we use a mixture-of-Gaussians likelihood in
combination with a dependent structure constructed by linear combinations of
completely random measures (CRMs). Our experiments show superior performance
under this framework, where the inferred values, including the mixing weights
and the number of clusters, both respond appropriately to the number of
completely random measures used.
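A minimal sketch of the linear-combination construction described above, assuming gamma CRMs truncated to a finite number of atoms (the shape/scale values and the two-group setup are illustrative, not the paper's exact specification). Each group mixes a shared CRM with an idiosyncratic one; normalizing the summed atom weights yields the group's mixing weights, which is how atom sharing lets groups borrow information:

```python
import numpy as np

rng = np.random.default_rng(0)

def gamma_crm_atoms(n_atoms, shape=0.5, scale=1.0):
    """Unnormalized atom weights of a (truncated) gamma CRM."""
    return rng.gamma(shape, scale, size=n_atoms)

# Two groups share one CRM's atoms and add an idiosyncratic CRM each,
# i.e. a linear combination of completely random measures.
shared = gamma_crm_atoms(50)
idio_1 = gamma_crm_atoms(50)
idio_2 = gamma_crm_atoms(50)

w1 = shared + idio_1          # unnormalized measure for group 1
w2 = shared + idio_2          # unnormalized measure for group 2

p1 = w1 / w1.sum()            # normalization yields mixing weights
p2 = w2 / w2.sum()
```

Because both groups carry the `shared` atoms, the same Gaussian components receive weight in both groups, while the idiosyncratic CRMs let the weights differ.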
A non-parametric conditional factor regression model for high-dimensional input and response
In this paper, we propose a non-parametric conditional factor regression
(NCFR) model for domains with high-dimensional inputs and responses. NCFR
enhances linear regression in two ways: (a) introducing low-dimensional latent
factors, leading to dimensionality reduction, and (b) integrating an Indian
Buffet Process as a prior on the latent factors to allow an unbounded number of
sparse latent dimensions. Experimental results comparing NCFR to several
alternatives give evidence of remarkable prediction performance.
Comment: 9 pages, 3 figures, NIPS submission
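The Indian Buffet Process prior mentioned above has a standard generative description: customer i takes each previously sampled dish k with probability m_k / i (where m_k is how many earlier customers took it), then samples Poisson(alpha / i) new dishes. A minimal sketch of that generative process (the alpha value and matrix size are illustrative):

```python
import numpy as np

def sample_ibp(n_rows, alpha, rng):
    """Generative Indian Buffet Process: returns a binary matrix whose
    number of columns (latent dimensions) is unbounded a priori."""
    Z = np.zeros((0, 0), dtype=int)
    for i in range(1, n_rows + 1):
        if Z.shape[1] > 0:
            counts = Z.sum(axis=0)                    # dish popularity m_k
            new_row = (rng.random(Z.shape[1]) < counts / i).astype(int)
        else:
            new_row = np.zeros(0, dtype=int)
        k_new = rng.poisson(alpha / i)                # brand-new dishes
        Z = np.hstack([Z, np.zeros((i - 1, k_new), dtype=int)])
        new_row = np.concatenate([new_row, np.ones(k_new, dtype=int)])
        Z = np.vstack([Z, new_row])
    return Z

rng = np.random.default_rng(1)
Z = sample_ibp(10, alpha=2.0, rng=rng)
```

Each row is one observation's sparse binary factor-activation pattern; the expected total number of active columns grows logarithmically with the number of rows.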
An Adaptive Online HDP-HMM for Segmentation and Classification of Sequential Data
In recent years, the desire and need to understand sequential data have been
increasing, with particular interest in sequential contexts such as patient
monitoring, daily activity recognition, video surveillance, stock markets and
the like. Along with the constant flow of data, it is critical to classify and
segment the observations on the fly, without being limited to a rigid number of
classes. In addition, the model needs to be capable of updating its parameters
to comply with possible evolutions. This interesting problem, however, is not
adequately addressed in the literature, since many studies focus on offline
classification over a predefined class set. In this paper, we propose a
principled solution to this gap by introducing an adaptive online system based
on Markov switching models with hierarchical Dirichlet process priors. This
infinite adaptive online approach is capable of segmenting and classifying
sequential data over an unbounded number of classes, while meeting the memory
and delay constraints of streaming contexts. The model is further enhanced by
introducing a learning rate, responsible for balancing the extent to which the
model sustains its previous learning (parameters) or adapts to the new
streaming observations. Experimental results on several variants of stationary
and evolving synthetic data and two video datasets, TUM Assistive Kitchen and
collated Weizmann, show remarkable performance in segmentation and
classification, particularly for evolutionary sequences with changing
distributions and/or containing new, unseen classes.
Comment: 23 pages, 9 figures and 4 tables
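The learning-rate idea above can be sketched as an exponential-forgetting blend of sufficient statistics: a rate near 1 sustains previously learned parameters, a rate near 0 adapts aggressively to the newest batch. This is a generic illustration under that assumption, not the paper's exact update rule, and the count matrices are made up:

```python
import numpy as np

def blend_transition_counts(old_counts, new_counts, lam):
    """Learning-rate update for a Markov switching model's transition
    statistics: lam -> 1 keeps past learning, lam -> 0 follows new data."""
    return lam * old_counts + (1.0 - lam) * new_counts

old = np.array([[8.0, 2.0], [1.0, 9.0]])   # accumulated transition stats
new = np.array([[1.0, 4.0], [3.0, 2.0]])   # stats from the latest batch
blended = blend_transition_counts(old, new, lam=0.9)
trans = blended / blended.sum(axis=1, keepdims=True)  # row-stochastic matrix
```

Streaming systems favor this form because only the blended statistics need to be stored, satisfying the memory constraint mentioned in the abstract.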
Bayesian nonparametric image segmentation using a generalized Swendsen-Wang algorithm
Unsupervised image segmentation aims at clustering the set of pixels of an
image into spatially homogeneous regions. We introduce here a class of Bayesian
nonparametric models to address this problem. These models are based on a
combination of a Potts-like spatial smoothness component and a prior on
partitions which is used to control both the number and size of clusters. This
class of models is flexible enough to include the standard Potts model and the
more recent Potts-Dirichlet Process model \cite{Orbanz2008}. More importantly,
any prior on partitions can be introduced to control the global clustering
structure so that it is possible to penalize small or large clusters if
necessary. Bayesian computation is carried out using an original generalized
Swendsen-Wang algorithm. Experiments demonstrate that our method is competitive
in terms of the Rand index compared to popular image segmentation methods, such
as mean-shift, and to recent alternative Bayesian nonparametric models.
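The Swendsen-Wang machinery referred to above rests on an auxiliary-variable step: bonds are opened between neighboring sites with equal labels with probability 1 - exp(-beta), and the resulting connected blocks are then relabelled jointly. A minimal sketch of that bond step on a 1-D chain (the labels and coupling strength are illustrative; the paper operates on 2-D pixel lattices with a partition prior on top):

```python
import numpy as np

rng = np.random.default_rng(0)

def sw_components_1d(labels, beta, rng):
    """One Swendsen-Wang auxiliary step on a chain: open a bond between
    equal-label neighbours w.p. 1 - exp(-beta); chained bonds merge
    sites into blocks that would be relabelled together."""
    n = len(labels)
    comp = np.arange(n)
    for i in range(n - 1):
        if labels[i] == labels[i + 1] and rng.random() < 1.0 - np.exp(-beta):
            comp[i + 1] = comp[i]          # merge into the left block
    return comp

labels = np.array([0, 0, 0, 1, 1, 2])
# A very large beta opens every admissible bond, so blocks coincide
# with the runs of equal labels.
comp = sw_components_1d(labels, beta=50.0, rng=rng)
```

Updating whole blocks at once is what lets the sampler traverse the strongly coupled Potts posterior far faster than single-site Gibbs moves.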
Smoothed Hierarchical Dirichlet Process: A Non-Parametric Approach to Constraint Measures
Time-varying mixture densities occur in many scenarios, for example, the
distributions of keywords that appear in publications may evolve from year to
year, video frame features associated with multiple targets may evolve in a
sequence. Any models that realistically cater to this phenomenon must exhibit
two important properties: the underlying mixture densities must have an unknown
number of mixtures, and there must be some "smoothness" constraints in place
for the adjacent mixture densities. The traditional Hierarchical Dirichlet
Process (HDP) may be suited to the first property, but certainly not the
second: each random measure in the lower hierarchy is sampled independently of
the others, and hence the HDP does not capture any temporal correlations. To
overcome this shortcoming, we propose a new Smoothed Hierarchical Dirichlet
Process (sHDP). The key novelty of this model is that we place a temporal
constraint amongst the nearby discrete measures, in the form of a symmetric
Kullback-Leibler (KL) divergence with a fixed bound. Although the constraint
involves only a single scalar value, it nonetheless allows for flexibility in
the corresponding successive measures. Remarkably, it also leads us to infer
the model within the stick-breaking process, where the traditional Beta
distribution used in stick-breaking is replaced by a new constraint calculated
from this bound. We present the inference algorithm and elaborate on its
solutions. Our experiments using NIPS keywords show the desired effect of the
model.
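The smoothness constraint above is a symmetric KL divergence between adjacent discrete measures, required to stay below a fixed scalar bound. A minimal sketch of checking that constraint for two successive (truncated) measures, with made-up weights and bound:

```python
import numpy as np

def sym_kl(p, q, eps=1e-12):
    """Symmetric KL divergence KL(p||q) + KL(q||p) between two
    discrete distributions over the same atoms."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

g_prev = np.array([0.50, 0.30, 0.20])   # measure at time t
g_next = np.array([0.45, 0.35, 0.20])   # measure at time t + 1
bound = 0.1                              # the fixed scalar bound
within = sym_kl(g_prev, g_next) <= bound
```

In the sHDP this check is not applied after the fact; the constraint is built into the stick-breaking construction so that successive measures are smooth by design.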
Cooperative Hierarchical Dirichlet Processes: Superposition vs. Maximization
The cooperative hierarchical structure is a common and significant data
structure observed in, or adopted by, many research areas, such as text mining
(author-paper-word) and multi-label classification (label-instance-feature).
Renowned Bayesian approaches for cooperative hierarchical structure modeling
are mostly based on topic models. However, these approaches suffer from a
serious issue in that the number of hidden topics/factors needs to be fixed in
advance and an inappropriate number may lead to overfitting or underfitting.
One elegant way to resolve this issue is Bayesian nonparametric learning, but
existing work in this area still cannot be applied to cooperative hierarchical
structure modeling.
In this paper, we propose a cooperative hierarchical Dirichlet process (CHDP)
to fill this gap. Each node in a cooperative hierarchical structure is assigned
a Dirichlet process to model its weights on the infinite hidden factors/topics.
Together with measure inheritance from hierarchical Dirichlet process, two
kinds of measure cooperation, i.e., superposition and maximization, are defined
to capture the many-to-many relationships in the cooperative hierarchical
structure. Furthermore, two constructive representations for CHDP, i.e.,
stick-breaking and international restaurant process, are designed to facilitate
the model inference. Experiments on synthetic and real-world data with
cooperative hierarchical structures demonstrate the properties and the ability
of CHDP for cooperative hierarchical structure modeling and its potential for
practical application scenarios.
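The two cooperation operations named in the title can be illustrated on finite measures over a shared atom set: superposition adds atom weights and renormalizes, while maximization keeps the pointwise larger weight and renormalizes. The weights below are made up; the CHDP applies these operations to infinite discrete measures:

```python
import numpy as np

# Two normalized measures over the same set of hidden factors/topics.
m1 = np.array([0.6, 0.3, 0.1])
m2 = np.array([0.2, 0.2, 0.6])

# Superposition: sum the atom weights, then renormalize.
sup_raw = m1 + m2
sup = sup_raw / sup_raw.sum()

# Maximization: take the pointwise maximum weight, then renormalize.
mx_raw = np.maximum(m1, m2)
mx = mx_raw / mx_raw.sum()
```

Superposition lets every parent contribute proportionally, whereas maximization lets the strongest parent dominate each atom; both yield a valid measure, which is what makes them usable inside the hierarchy.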
Learning Hidden Structures with Relational Models by Adequately Involving Rich Information in A Network
Effectively modelling hidden structures in a network is very practical but
theoretically challenging. Existing relational models only involve very limited
information, namely the binary directional link data, embedded in a network to
learn hidden networking structures. Other rich and meaningful information
(e.g., various attributes of entities, and information more granular than
binary elements such as "like" or "dislike") is missed, even though it plays a
critical role in forming and understanding relations in a network. In this
work, we
propose an informative relational model (InfRM) framework to adequately involve
rich information and its granularity in a network, including metadata
information about each entity and various forms of link data. Firstly, an
effective metadata information incorporation method is employed on the prior
information from relational models MMSB and LFRM. This is to encourage the
entities with similar metadata information to have similar hidden structures.
Secondly, we propose various solutions to cater for alternative forms of link
data. Substantial efforts have been made towards modelling appropriateness and
efficiency, for example, using conjugate priors. We evaluate our framework and
its inference algorithms on different datasets, which demonstrates the
generality and effectiveness of our models in capturing implicit structures in
networks.
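One of the base relational models named above, MMSB, scores a directed link through mixed community memberships: the probability of an i -> j link is the sender's membership vector times a block matrix times the receiver's membership vector. A minimal sketch with made-up memberships and block matrix (the metadata-informed priors of InfRM are not shown):

```python
import numpy as np

# B[k, l]: probability that a community-k sender links to a
# community-l receiver (the "blockmodel" of MMSB).
B = np.array([[0.9, 0.1],
              [0.1, 0.8]])

pi_i = np.array([0.7, 0.3])   # entity i's mixed community memberships
pi_j = np.array([0.2, 0.8])   # entity j's mixed community memberships

# Marginal probability of a directed i -> j link under MMSB.
p_link = float(pi_i @ B @ pi_j)
```

InfRM's metadata incorporation then encourages entities with similar attributes to have similar membership vectors, so their link patterns (and hidden structures) align.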
Diverse Online Feature Selection
Online feature selection has been an active research area in recent years. We
propose a novel diverse online feature selection method based on Determinantal
Point Processes (DPP). Our model aims to provide diverse features which can be
composed in either a supervised or unsupervised framework. The framework aims
to promote diversity based on the kernel produced on a feature level, through
at most three stages: feature sampling, local criteria and global criteria for
feature selection. In the feature sampling stage, we sample the incoming
stream of features using a conditional DPP. The local criterion is used to
assess and select streamed features (i.e., only when they arrive): we use
unsupervised scale-invariant methods to remove redundant features, and
optionally supervised methods that introduce label information to assess
feature relevance. Lastly, the global criterion uses regularization methods to
select a globally optimal subset of features. This three-stage procedure
continues until no more features arrive or some predefined stopping condition
is met. We demonstrate through experiments that this approach yields better
compactness, and is comparable to, and in some instances outperforms, other
state-of-the-art online feature selection methods.
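The diversity-promoting mechanism of a DPP can be sketched with a greedy MAP selection: repeatedly add the feature that maximizes the determinant of the selected kernel submatrix, which penalizes picking near-duplicate features. The kernel below is made up, and this greedy batch routine is only an illustration of the DPP principle, not the paper's streaming conditional-DPP sampler:

```python
import numpy as np

def greedy_dpp(L, k):
    """Greedy MAP selection under DPP kernel L: at each step add the
    item maximizing det(L[S, S]) for the selected set S."""
    selected = []
    for _ in range(k):
        best, best_det = None, -np.inf
        for j in range(L.shape[0]):
            if j in selected:
                continue
            idx = selected + [j]
            det = np.linalg.det(L[np.ix_(idx, idx)])
            if det > best_det:
                best, best_det = j, det
        selected.append(best)
    return selected

# Features 0 and 1 are near-duplicates (similarity 0.95); 2 is distinct.
L = np.array([[1.00, 0.95, 0.10],
              [0.95, 1.00, 0.10],
              [0.10, 0.10, 1.00]])
picked = greedy_dpp(L, 2)
```

Having picked feature 0, the determinant for adding its near-duplicate 1 is only 1 - 0.95**2, so the distinct feature 2 wins: that volume-shrinking effect is exactly the diversity the abstract appeals to.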
Dependent Indian Buffet Process-based Sparse Nonparametric Nonnegative Matrix Factorization
Nonnegative Matrix Factorization (NMF) aims to factorize a matrix into two
optimized nonnegative matrices appropriate for the intended applications. The
method has been widely used for unsupervised learning tasks, including
recommender systems (rating matrix of users by items) and document clustering
(weighting matrix of papers by keywords). However, traditional NMF methods
typically assume the number of latent factors (i.e., dimensionality of the
loading matrices) to be fixed. This assumption makes them inflexible for many
applications. In this paper, we propose a nonparametric NMF framework to
mitigate this issue by using dependent Indian Buffet Processes (dIBP). In a
nutshell, we apply a correlation function for the generation of two stick
weights associated with each pair of columns of loading matrices, while still
maintaining their respective marginal distribution specified by IBP. As a
consequence, the generation of two loading matrices will be column-wise
(indirectly) correlated. Under this framework, two classes of correlation
functions are proposed: (1) using the bivariate beta distribution and (2)
using copula functions. Both methods allow our work to be adapted to various
applications by flexibly choosing appropriate parameter settings. Compared
with other state-of-the-art approaches in this area, such as the Gaussian
Process (GP)-based dIBP, our work is much more flexible in allowing the two
corresponding binary matrix columns to have greater variation in their
non-zero entries. Our experiments on real-world and synthetic datasets show
that the three proposed models perform well on the document clustering task
compared with standard NMF, without predefining the dimensions of the factor
matrices, and that the bivariate beta distribution-based and copula-based
models have better flexibility than the GP-based model.
Comment: 14 pages, 10 figures
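One standard way to obtain a pair of correlated stick weights with Beta marginals, in the spirit of the bivariate-beta class above, is the shared-gamma construction: u = G_a/(G_a + G_c) and v = G_b/(G_b + G_c) reuse the component G_c, giving Beta(a, c) and Beta(b, c) marginals with positive dependence. The parameter values are illustrative, and this is one construction of a bivariate beta, not necessarily the paper's exact one:

```python
import numpy as np

rng = np.random.default_rng(0)

def bivariate_beta(a, b, c, size, rng):
    """Correlated Beta pairs via shared gamma components: u ~ Beta(a, c)
    and v ~ Beta(b, c) marginally, positively correlated through G_c."""
    ga = rng.gamma(a, size=size)
    gb = rng.gamma(b, size=size)
    gc = rng.gamma(c, size=size)   # shared component induces correlation
    return ga / (ga + gc), gb / (gb + gc)

u, v = bivariate_beta(2.0, 2.0, 2.0, size=20000, rng=rng)
corr = float(np.corrcoef(u, v)[0, 1])
```

Used as stick weights for the two loading matrices, such pairs make the matrices column-wise correlated while each column's marginal stick-breaking law stays intact, which is the dIBP requirement stated in the abstract.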
Nonparametric Relational Topic Models through Dependent Gamma Processes
Traditional relational topic models provide a way to discover the hidden
topics in a document network. Many theoretical and practical tasks, such as
dimensionality reduction, document clustering, and link prediction, benefit
from this
revealed knowledge. However, existing relational topic models are based on an
assumption that the number of hidden topics is known in advance, and this is
impractical in many real-world applications. Therefore, in order to relax this
assumption, we propose a nonparametric relational topic model in this paper.
Instead of using fixed-dimensional probability distributions in its generative
model, we use stochastic processes. Specifically, a gamma process is assigned
to each document, which represents the topic interest of this document.
Although this method provides an elegant solution, it brings additional
challenges when mathematically modeling the inherent network structure of a
typical document network, i.e., two spatially closer documents tend to have
more similar topics. Furthermore, we require that the topics are shared by all
the documents. In order to resolve these challenges, we use a subsampling
strategy to assign each document a different gamma process derived from the
global gamma process, where the subsampling probabilities of the documents are
given a Markov random field constraint that inherits the document network
structure. Through the designed posterior inference algorithm, we can discover
the hidden topics and their number simultaneously. Experimental results on
both synthetic and real-world network datasets demonstrate the capability of
learning the hidden topics and, more importantly, the number of topics.
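The subsampling strategy above can be sketched by thinning a shared (truncated) gamma process: each document keeps each global atom with some probability, so every document's measure is supported on the shared topic atoms. The atom count, gamma parameters, and the fixed keep-probabilities are illustrative; in the paper the keep-probabilities follow a Markov random field over the document network rather than being constants:

```python
import numpy as np

rng = np.random.default_rng(0)

# Global (truncated) gamma process: shared topic atoms with weights.
n_atoms = 30
global_weights = rng.gamma(shape=0.5, scale=1.0, size=n_atoms)

def subsample(weights, q, rng):
    """Thin the global gamma process: keep each atom w.p. q, zeroing
    the rest, so the document's measure is a subset of shared atoms."""
    keep = rng.random(len(weights)) < q
    return weights * keep

# Linked documents would receive similar q's under the MRF constraint,
# making their kept topic sets, and hence their topics, similar.
doc_a = subsample(global_weights, q=0.8, rng=rng)
doc_b = subsample(global_weights, q=0.8, rng=rng)
```

Because every document draws from the same global atoms, topics remain shared across the corpus while the thinning probabilities encode the network structure.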